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Abstract 

We derive a convex optimization problem on a steady-state nonequilibrium network of 
biochemical reactions, with the property that energy conservation and the second law 
of thermodynamics both hold at the problem solution. This suggests a new variational 
principle for biochemical networks that can be implemented in a computationally tractable 
manner. We derive the Lagrange dual of the optimization problem and use strong duality 
to demonstrate that a biochemical analogue of Tellegen's theorem holds at optimality. 
Each optimal flux is dependent on a free parameter that we relate to an elementary kinetic 
parameter when mass action kinetics is assumed. 

Keywords: constraint-based modeling, flux balance analysis, thermodynamics, convex 
optimization, entropy function 



1. Introduction 

The biochemical system of any organism can be represented mathematically by a net- 
work of chemicals (nodes) and reactions (edges). To analyze these networks at genome 
scale, systems biologists often use a linear optimization technique called flux balance anal- 
ysis (FB A) [42] . Flux balance requires that the sum of fluxes into and out of each node in 
the network be zero. This is equivalent to Kirchhoff's current law in an electrical network. 
Recent work has sought to augment flux balance analysis with Kirchhoff's loop law for 
energy conservation as well as the second law of thermodynamics [5, 36, 51, 30, 18, 43]. 
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The incorporation of thermodynamic constraints into genome-scale models has pro- 
duced models that are biologically more realistic and reveal greater insight into the control 
mechanisms operating in these complex biological systems [5, 26, 50, 19, 27]. See [45] for 
a recent broad review of the application of thermodynamic constraints to biochemical net- 
works. The second law of thermodynamics may be applied to each reversible elementary 
reaction by constraining net reaction flux to be in the direction of a negative change in 
chemical potential [9]. Constraints on net reaction flux can be incorporated within flux 
balance analysis as additional linear inequality constraints [17]. Software packages to quan- 
titatively assign reaction directionality for genome-scale models of metabolism are available 
[16, 44]. 

In contrast to the addition of constraints arising from the second law of thermody- 
namics, the addition of energy conservation constraints has been problematic because the 
resulting equations are nonlinear and/or nonconvex. Previous attempts required comput- 
ing the global solution of a nonconvex continuous optimization problem [5, 30, 18], solving 
an NP-hard problem [51], or solving a mixed integer linear program [20, 43]. Mixed integer 
programs have unpredictable computational complexity. 

The purpose of this work is to show that Kirchhoff 's loop law and the second law of 
thermodynamics arise naturally from the optimality conditions of a convex optimization 
problem with flux balance constraints. Furthermore, every set of reaction fluxes that 
satisfies Kirchhoff's loop law and the second law of thermodynamics must be optimal 
for some instance of this problem. This suggests that there is an underlying variational 
principle operating in biochemical networks. Our convex optimization formulation leads to 
polynomial-time algorithms [52] for computing steady state fluxes that also satisfy energy 
conservation and the second law of thermodynamics. 



2. Linear resistive netw^orks 



Consider a simple electrical circuit consisting of current sources, batteries, and resistors, 
as illustrated in Figure 1. This is a linear resistive network with m nodes and n edges, 
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Figure 1; A linear resistive network with currents x, potentials y, batteries b, and current sources /. 

where the node variables y G represent potentials and the edge variables x S R" 
represent flows (or currents) in the network. The circuit topology is defined by a node- 
edge incidence matrix A € IR'"^"', and properties of the network are encoded in a set of 
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data vectors: / € R'" is a vector of current sources, b £ E," is a vector of batteries, and 
r E R" is a vector of resistances (r > 0). 

To solve for the voltages and currents in the circuit we use three fundamental laws: 
Kirchhoff's current law (KCL) Ax = /, Kirchhoff's voltage (or loop) law (KVL) w = 
b + A^y, and Ohm's law w = Rx (where w G is a vector of voltages and R = diag(r) is 
a positive-definite diagonal matrix) . Maxwell's minimum heat theorem [28] is a variational 
principle that underlies this circuit. It seeks a set of currents that minimize the heat (or 
power) dissipated subject to KCL. This is the convex optimization problem [46] 

minimize F{x) = ^x'^Rx — b^x fnv>\ 
subject to Ax = f : y 

where the node variables y are Lagrange multipliers for the equality constraints. The 
optimality conditions \/F{x) = A^y yield equations that enforce KVL and Ohm's Law, 
and the optimal variables x* and y* are a set of consistent potentials and currents for the 
circuit. 

Biochemical networks are significantly more complicated than linear resistive networks. 
However, some of the same underlying network concepts apply [32]. In this work we con- 
struct an optimization problem in a form similar to problem (QP), where the potentials are 
Lagrange multipliers for an equality constraint on the fiow variables, and the optimality 
conditions of the problem yield equations that enforce Kirchhoff's loop law and the second 
law of thermodynamics. Previous work noted an "analogy" between Lagrange multipliers 
and chemical potentials in flux balance analysis [49]. The key limitation of that work, 
concerned with duality in linear optimization, is that the optimality conditions of a lin- 
ear optimization problem cannot enforce any relationship between net flux and change in 
chemical potential. In the present work, we establish a quantitative relation between the 
Lagrange multipliers and classical chemical potential for a nonlinear, yet convex, optimiza- 
tion problem in a framework consistent with classical thermodynamics. 

3. Biochemical netvi^orks 

The mathematical representation of a biochemical network is the stoichiometric matrix 
S. Like A above, S G R™"^" is a sparse incidence matrix that encodes the network topology. 
However, biochemical networks, unlike linear resistive networks, are nonlinear networks or 
hypergraphs. That is, a single edge may link many nodes to many nodes, and the entries 
in S, which are integer stoichiometric coefficients, are not confined to the set {—1,0,1}. 
Each row of S corresponds to an individual chemical compound, and each column of S 
corresponds to an individual elementary reaction. In practice, m < n and S does not 
have full row-rank. A model of a system is called genome-scale if a large proportion of the 
system's genes are represented. In current genome-scale models of the metabolic system of 
E. coli, m and n are several thousand. 
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Flux balance analysis (FBA) computes a set of fluxes tliat satisfy steady state mass- 
conservation constraints and are optimal for a biological objective function [42, 31]. A flux 
is a reaction rate; it represents flow through the network and is analogous to current in an 
electrical circuit. We denote the net flux of the jth reaction by the variable Vj G R. The 
concentration of the ith chemical in the network is denoted Xj G R. A fundamental equation 
in flux balance analysis is the dynamic mass conservation equation — a differential equation 
relating the change in chemical concentration to reaction fluxes via the stoichiometric 
matrix: 

Sv = ^. 

dt 

Here x € R'" is a vector of chemical concentrations and v G R" is a vector of net reaction 
fluxes. Each row of this vector equation states that the rate of change in concentration for 
a chemical is the sum of fluxes that synthesize or degrade that chemical. 

So far we have considered net flux for a stoichiometric matrix of reactions, each of 
which conserves mass. A living biochemical system operates in a nonequilibrium state [37] 
and exchanges mass with its surroundings. This can be modeled by including exchange 
reactions that do not conserve mass. The model is augmented with a matrix Se € R™'^'^ 
and a corresponding set of exchange fluxes Ve, which are sources and sinks of chemicals 
and are analogous to current sources in an electrical network. In contrast to the columns 
of S, each column of Se corresponds to a reaction that does not conserve mass. If we 
assume that the biochemical system is operating at a steady state, then the concentrations 
of chemicals within the system remain constant. Thus, we have 



This equation is analogous to Kirchhoff 's current law in an electrical network. Henceforth 
we assume that there exists a feasible solution {v,Ve) for (1). Ensuring the existence of a 
steady state flux is part of a quality control process during reconstruction of S and Se from 
experimental literature [48]. 

Any reversible reaction A + 2B ^ C may be split into two one-way reactions: forward 
(A + 2B — > C) and reverse {A + 2B C). To distinguish between forward, reverse, and 
exchange reactions we split the augmented stoichiometric matrix into three components: 
[S —S Se]- Here S contains all the columns corresponding to forward reactions, and —S 
contains all columns corresponding to the reverse reactions. The fluxes for these one-way 
reactions form the vectors vj and Vr- The net flux is then v = vj — Vr- 

Flux balance analysis has been implemented using linear programming [34]. We take 
the flux balance analysis problem to be 



Sv + SeVe = -rr = 
dt 




maximize d Ve 

Vf,Vr,Ve 



(FBA) 



subject to Svf — SVr + SeVe = 

Vf, Vr >0, ^ < Ve < h. 
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Often, the lower bounds i and upper bounds h on the exchange fluxes come from laboratory 
measurements {e.g. the uptake of glucose in a particular culture of E. coli). The vector d 
is chosen to optimize a biological objective {e.g. maximizing replication rate in unicellular 
organisms). Note that flux balance analysis does not explicitly solve for Vf and Vr but 
rather v = Vf — Vr. 

We now prove a lemma about alternative optimal solutions to problem (FBA). 

Lemma 1. If there is an optimal solution to problem (FBA), there is an optimal solution 
with strictly positive internal fluxes Vf and Vr- 

Proof. Let {vpV*,v*) be an optimal solution to problem (FBA). It could be the case that 
one or more components of or are zero; that is, they are sitting on their lower bounds. 
We want to show that we can construct a new optimal solution {vj,v°,v*) with strictly 
positive fluxes Vj and v°. To do this, let v'ji = + ae and v° = v* + ae, where a is any 
positive scalar and e € R" is the vector of all ones. The internal fluxes {vpV°) are strictly 
positive, and the set of fluxes {vpV°,v*) is feasible because i < v* < h and 

Svf — Sv° + SeVe = S{v*jr + oe) — S{v* + ae) + SeV* 

= Sv} - Sv* + 5e< = 0. 

Finally, the set of fluxes {vj,v°, v*) is optimal because it has the same objective value d?"v* 
as the optimal solution {vj,v*,v*). Here we used the fact that the objective function in 
problem (FBA) depends only on the exchange fluxes Vg', not on the internal fluxes Vf and 

Vr. □ 



4. Thermodynamic constraints 

Flux balance analysis predicts fluxes that satisfy steady state mass conservation but not 
necessarily energy conservation or the second law of thermodynamics. Whilst the domain 
of steady state mass conserved fluxes includes those that are thermodynamically feasible, 
additional constraints are required in order to guarantee a flux that additionally satisfies 
energy conservation and the second law of thermodynamics. Recent work has tried to 
add constraints based on Kirchhoff 's loop law and the second law of thermodynamics to 
problem (FBA) [5, 36, 30]. However, these constraints are problematic because they are 
nonlinear and nonconvex. We now describe these constraints. 

The loop law for chemical potentials in a biochemical network is directly analogous to 
Kirchhoff 's voltage law for electrical circuits. It states that the stoichiometrically weighted 
sum of chemical potentials around any closed loop of chemical reactions is zero. We ex- 
plicitly model a chemical potential G IR, for each of the chemicals in the network, and 
we define the change in chemical potential for all internal reactions in the network as the 
vector 

An = S^u (G WC), (2) 
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where u € R"* is the vector of chemical potentials. This ensures that Kirchhoff 's loop law 
is satisfied, as energy conservation requires that an injective relation exist between each 
row of S and a chemical potential [35]. It follows that the change in chemical potential 
around a stoichiometrically balanced loop will be zero. 

Assuming mass-action kinetics, constant temperature and pressure, and uniform spatial 
concentrations (i.e. a well mixed system), it is known {e.g. [41]) that the change in chemical 
potential may be expressed in terms of elementary one-way reaction rates as 

Au = plog{vr./vf), (3) 

where ./ denotes component- wise division of vectors, and p = RT > is the gas con- 
stant multiplied by temperature. Equation (3) leads directly to a macroscopic (long-term) 
application of the second law of thermodynamics: 

— Auj Vj = —Auj {Vfj — Vrj) > 0, 

which says that the net flux for each elementary reaction must be down a gradient of 
chemical potential, that the system must dissipate heat, and that entropy must increase as 
a result of work being done on the system through the exchange fluxes [37] . The total heat 
dissipation rate of the biochemical system in a non-equilibrium steady state is given by 
—Au'^v > 0. Henceforth we consider each flux as a dimensionless quantity. Dimensioned 
flux can be rendered dimensionless by division with a standard flux of the same dimensions. 
This standard may be defined according to a convenient timescale in the same way that a 
standard concentration may be defined according to a convenient abundance scale. 

Under the specified assumptions, we now define a thermodynamically feasible flux. 

Definition 1. For a network described by an augmented stoichiometric matrix [S —S Se] 
with a given set of exchange fluxes Ve, a set of thermodynamically feasible fluxes is a pair 
of internal flux vectors {vf,Vr) > that satisfy steady-state mass-balance, 

Svf - SVr = -SeVe, (4) 

and for which there exists an underlying vector of chemical potentials u € R"^ that satisfles 
(2) and (3): 

Au = S'^U = plog {Vr ■/ Vf) = plogVr — plogVf. (5) 



5. A variational principle 

We now present the main theorem of this paper. The theorem introduces a new convex 
optimization problem with the same flux balance constraints as problem (FBA) but with a 
negative entropy objective function. It states that the thermodynamic constraints (4) and 
(5) hold at its unique solution. We use e to denote a vector of ones. 
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Theorem 1. Let v* be any set of optimal exchange fluxes from problem (FBA). Define 
b = —SeV*, and let c be any vector in R". The convex equality- constrained problem 

minimize 4> = vT{log{vf) + c — e) + vJ{log{vr) + c — e) 

Vf,Vr>0 J (EP) 

subject to Svf — Svr = b : y 

is then feasible, and its solution {vj,v*) is a set of thermodynamically feasible internal 
fluxes. The combined vector (vj,v*,v*) is thermodynamically feasible and optimal for 
problem (FBA). The associated chemical potentials u may be obtained from the optimal 
Lagrange multiplier y* € R"^ for the equality constraints according to u = —2py*. 

Proof. First note that the constraints Vf,Vr > are imphed by the domain of the logarithm; 
they have no associated Lagrange multiphers. From Lemma 1, we know that if is optimal 
for problem (FBA) and b = —SgV*, there must be corresponding positive internal fluxes Vf 
and Vr that satisfy Svf — Svr = b. Therefore, problem (EP) with this choice of b is always 
feasible. 

Define the objective function as (j){vf,Vr) and note that it is strictly convex because 
V'^(f){v f ,Vr) is positive-definite for all Vf,Vr > 0, and it is bounded below. Problem (EP) is 
thus a convex linear equality-constrained problem with a unique optimal solution (vpV*) 
that satisfies the optimality conditions 

5V = V„,0 = log(i;p + c, (6) 
-S^y* = V,,.(t> = log«) + c, (7) 
Sv) - Sv* = b (8) 

for some vector y* (which is not unique because S has low row rank). Subtracting (6) 
from (7) gives S'^{—2y*) = log{v* ./ v^). Taking u = —2py* we see from (5) that {v^,v*) is 
a pair of thermodynamically feasible fluxes with underlying chemical potentials u for the 
exchange fluxes v*. The combined vector {v*,VpVg) is feasible for problem (FBA), and is 
optimal because the objective d'^v* is unchanged. □ 

In summary, to compute an optimal solution {vpV*,v*) to problem (FBA) that is 
thermodynamically feasible, perform the following steps: solve problem (FBA) to find an 
optimal exchange fiux vector v*, form b = —SeV*, choose a vector c, and solve problem 
(EP) to find VpV* > 0. Since S is row rank deficient, problem (EP) defines Au uniquely, 
but not the part of u in the nullspace of S'^. Observe that if cj){vf,Vr) were a linear function 
of flux, one could not quantitatively relate flux to Lagrange multipliers at optimality, as 
primal and dual variables do not appear in the same optimality condition. This is the key 
limitation of previous efforts to draw an analogy between chemical potential and Lagrange 
multipliers to constraints in a linear optimization problem [49]. 

Note the structural similarity of problems (QP) and (EP). In both cases the primal 
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variables are the flows in the networks, the constraints impose Kirchhoff's current law, 
and the dual variables for these constraints are the potentials in the network. There is a 
clear physical interpretation of the objective function in problem (QP) and a variational 
principle in operation. We believe there must also be a variational principle in operation 
in biochemical networks. In an abstract analysis of nonlinear resistive networks [29], one 
would refer to the objective in (EP) as the resistive content, as it is the sum of the areas 
under the constitutive relationships between flux and change in potential, for each reaction. 
From an information theoretic perspective, by maximizing the entropy of the internal fluxes, 
problem (EP) gives the most unbiased prediction [25] of internal elementary fluxes, subject 
to mass-conservation constraints and boundary conditions imposed by exchange fluxes. 

The next theorem proves that if there is a set of thermodynamically feasible fluxes in 
a biochemical network, it must be the solution to an optimization problem in the form of 
problem (EP). 

Theorem 2. Every set of thermodynamically feasible fluxes Vf, Vr (and chemical potentials 
u) is the solution (and corresponding Lagrange multiplier) of a convex optimization problem 
in the form of problem (EP). 

Proof. We show how to choose the vector c S so that the given Vf,Vr, and u are optimal. 
Since vj and Vr satisfy (4), they are feasible. From (5) we have S^u = plog{vr ./ Vf). 
Letting y = —j^u we have u = —2py and hence 

-25^2/ = log(?;,)-log(^;/). 

If we take 2c = — log(t>r) — log(t'/), this gives 

2S'^y = 21og(v/) + 2c 
-2S^y = 2log{vr) + 2c, 

which with (4) are the optimality conditions for problem (EP) as given by (6)-(8). There- 
fore, with c = — |log(t>r) — ^log(f/), the given Vf and Vr are the optimal solution, with 
Lagrange multiplier y = —-^u. □ 

The vector c in problem (EP) is a set of free parameters. Theorem 1 tells us that 
regardless of the value of c, (EP) has a unique solution that is a thermodynamically feasible 
flux. Theorem 2 states that given any thermodynamically feasible flux there exists at least 
one associated vector c. 

6. Dual variational principle 

Given the primal optimization problem (EP) it is instructive to consider the equivalent 
dual optimization problem, as its objective function lends insight into the properties of the 
optimal dual variables for the primal problem. 
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By Lemma 1, the constraints of Problem (EP) have feasible solutions that are strictly 
positive. Similarly, there must be feasible solutions that are finite, and these have finite 
objective values. Hence the optimum of the strictly convex objective function of problem 
(EP) is finite and attainable. 

Proposition 1. The Lagrange dual of problem (EP) is the unconstrained convex optimiza- 
tion problem 

maximize ip{y) = b^y — e^ exp{S'^y — c) — e^ ex.p{—S'^y — c). (DEP) 
y 

Proof. The dual for problem (EP) is derived in terms of the associated Lagrangian, 

C {vf, Vr, y) = wJ(log(w/) + c - e) + v^{\og{vr) + c - e) - y'^iSvj - Svr - b). 

The dual function ip{y) is the set of greatest lower bounds (denoted inf for infimum) of 
the Lagrangian over the primal variables. Equivalently, it is the set of least upper bounds 
(denoted sup for supremum) of the negative Lagrangian: 

^{y)= inf C{vf,vr,y) = - swp -C{vf,vr,y). 

^/I'^r Vf,Vr 

The supremum of the linear function is itself, and the other terms may be grouped to 
give 

V'(y) = b^y - sup {y^Svf - vJ{log{vf) + c - e)) - sup {-y'^Svr - vJ{log{vr) + c - e)) . 

(9) 

The first supremum is attained when the partial derivative with respect to is zero: 

S^y -log{vf) - c = <^ f/ = exp(5^y - c). (10) 
Similarly for the second supremum, 

- S"^ y - log{vr) - c = Vr = exp{- S'^ y - c) . (11) 

Substituting (lO)-(ll) into (9) gives the Lagrange dual problem (DEP). □ 
The first and second derivatives of i^iy) are 

Vip{y) = b — S exp{S^y — c) + S'exp(— — c), 

vV(y) = -SDS'^, 

where D = diag(exp(5-^y — c) + exp(— — c)) is a positive-definite diagonal matrix for 
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all finite y (so that V'^ip{y) is negative semidefinite). The necessary first- and second-order 
conditions for a point y = y* to be an optimum are that S/ip{y*) = and V^^(y*) be 
negative semidefinite. With vj = exp(5"^y* — c) and v* = exp{—S^y* — c) we see that 
{vj,v*,y*) satisfy the optimality conditions for both (EP) and (DEP). 

The dual objective ip{y) offers a complementary insight into the meaning of our vari- 
ational principle. The chemical potential is defined hy u = —2py*. Up to scalar multipli- 
cation, —b'^y = ^b'^u corresponds to the rate of work being done by the environment to 
maintain the system away from equilibrium [38]. Therefore, one may interpret the Lagrange 
dual problem as a minimization of this rate of work, balanced against minimization of the 
sum of thermodyamically feasible forward and reverse fluxes. We note that minimization of 
a weighted linear combination of forward or reverse fluxes, where thermodynamic equilib- 
rium constants are used as weighting factors, has previously been explored as an optimality 
principle for metabolic networks [22]. 

A comprehensive introduction to convex optimization is given in [8]. Problem (EP) is 
an example of a monotropic optimization problem [39] . Further discussion of such problems 
(and duality) can be found in [7]. 

6.1. Strong duality and Tellegen's theorem 

Since problem (EP) is convex with a finite attainable optimum objective and linear 
constraints, it is known that strong duality holds [7]. Among other things, strong dual- 
ity means that the values of the primal and dual objectives are equal at the optimum: 
4>{v*pV*) = il^{y*). Omitting the *s and using (lO)-(ll), we have 

v^{\og{vf) + c — e) + {\og{vr) + c — e) = b'^y — exp{S'^y — c) — exp{—S'^y — c) 

= b^y — e'^Vf — e^Vr- 

Thus, 

v^j{\og{vf) + c) + v'^{\og{vr) + c) = b^y. (12) 
Also from (lO)-(ll) we have 

c = - log(t>/) + S'^y = - \og{vr) - S'^y. 

With u = —2py, (12) becomes 

S{Vf — Vr) = u^b. (13) 

On the left of (13) is the rate of entropy production by the biochemical network, and on 
the right is the rate of work done by the environment to maintain the system away from 
equilibrium. Their equality means that the rate at which heat is produced by chemical 
reactions equals the rate at which heat is dissipated into the environment, so the temper- 
ature of the system is time-invariant. In the thermodynamic literature, (13) is known as 
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the isothermal Clausius equahty [38], whereas from electrical network literature one may 
recognize (13) as a biochemical version of Tellegen's theorem. 

Oster and Desoer [33] previously recognized that a reaction flux vector satisfying Kirch- 
hoff 's current law and a chemical potential vector satisfying Kirchhoff 's voltage law is nec- 
essary and sufficient for Tellegen's theorem to hold for a system of biochemical reactions. 
Although Tellegen's theorem holds irrespective of any constitutive relationship between flux 
and chemical potential, if one first assumes that unidirectional flux is a (strictly) mono- 
tonically increasing function of change in chemical potential, then one obtains flux and 
potential vectors satisfying Kirchhoff's laws with a single (strictly) convex optimization 
problem [13]. The challenge is to pose an optimization problem with optimality conditions 
that enforce a constitutive relationship between flux and potential matching established 
theory in chemical kinetics. 

7. Thermodynamic feasibility and mass action kinetics 

From the optimality conditions for problem (EP), observe that forward flux is a func- 
tion of substrate and product chemical potentials. For an elementary reaction, one would 
expect that forward flux should depend only on substrate chemical potential(s) and reverse 
flux should depend only on product chemical potential(s) [14]. According to mass action 
kinetics, the rate of an elementary forward reaction only depends on a forward kinetic 
parameter and substrate concentration(s) [10]. In the objective of problem (EP), for sim- 
plicity of exposition, consider replacing c^{vf + Vr) with cTfj + c^Vr- When Cf — Cr is 




constrained to lie in the range of S'^ , this provides another mechanism for satisfying (5). 
Here we show that one may choose Cf,Cr such that mass action kinetics also holds. 

The chemical potential of the ith. metabolite in dilute solution at constant temperature 
and pressure can be expressed as 



where u° is the standard chemical potential, Xi is the molar concentration of the metabolite, 
and x" is a reference concentration that we define to be one molar. The standard chemical 
potential is therefore the chemical potential of a metabolite in a reference state. Assume we 
are given forward and reverse elementary kinetic parameter vectors kf,kr G IR*^ satisfying 
the thermodynamic feasibility condition 



for a particular standard chemical potential vector u" E R™. Mass action kinetics may be 





(14) 




(15) 
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expressed as a set of vectors Vf,Vr,kf,kr,x that satisfy 

log(wj) = log(A;/) + F'^log(a;), 

log{Vr) = log(/Cr) + i?'^log(x), 

where the stoichiometric matrix has been decomposed into the entry-wise difference be- 
tween forward and reverse stoichiometric matrices, respectively F,R^ lR>o"i defined by 

Sij = ^ {F,j = 0, Rij = 0}, 
Sij < — > {Fij = —Sij, Rij = 0}, 
Sij > — > {Fij = 0, Rij = Sij}. 

With respect to the forward reaction direction, each column of F contains the absolute 
value of the stoichiometric coefficient for each substrate, and the corresponding column of 
R contains the stoichiometric coefficient for each product. 

Given a consistent stoichiometric matrix and thermodynamically feasible kinetic pa- 
rameters, it is still an important open question whether for all b € 7^(5") there exists a 
metabolite concentration vector x satisfying steady-state mass action kinetics, namely 

5exp(log(A;/) + F^log(x)) - S'exp(log(/cr) + log(x)) = b. (16) 

Ongoing work aims to establish the necessary and sufficient conditions for this to be so [1]. 
Nonetheless, assuming (16) is satisfiable, there exist Cf and Cr such that 

Cf = - log(/cj) + R^y*, Cr = - \og{kr) + F^y*, 

and optimality conditions (lO)-(ll) become 

log{v}) = log{kf) - F^y\ log«) = log(A:,) - R^y\ 

The optimal dual vector may then be equated with the negative of logarithmic concen- 
tration y* = — log(3;), leading to satisfaction of mass action kinetics in a non-equilibrium 
steady state. 

8. Discussion 

With problem (EP) and Theorem 1 we provide a new variational principle that enforces 
steady state mass conservation, energy conservation and the second law of thermodynamics 
for genome-scale biochemical networks. Moreover, we prove that all thermodynamically 
feasible steady state fluxes are instances of problem (EP) for a different free parameter 
vector c € R" (Theorem 2). The values of the free parameters influence the optimal 
forward and reverse fluxes through the relation v*r .*v* = exp(— 2c). Varying c may change 
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v'j and V* but not the fact that the corresponding solution is thermodynamically feasible 
and optimal for flux balance analysis. 

To compute a thermodynamically feasible flux we must first choose a value for the 
vector c. Thus, there is some freedom left in the model. We see this property of the model 
as an advantage rather than a disadvantage, as thermodynamic feasibility is necessary but 
not sufficient for satisfaction of mass action kinetics [23]. In the objective of problem (EP), 
one could replace {v f + Vr) by CjVj + cjvr- Assuming the existence of a mass action 
kinetic non-equilibrium steady state, section 7 demonstrates that this state corresponds to 
a particular choice of c/,Cr given particular kinetic parameters kf,kr. Kinetic parameters 
evolve, in the biological sense, subject to thermodynamic feasibility (15). Even if Cf,Cr 
could be chosen such that mass action kinetics holds, the actual kf,kr that pertain to a 
particular organism's biochemical network would remain unknown. This problem is not 
particular to our modeling approach — it reflects a paucity of suitable enzyme kinetic data 
in biochemistry generally [11]. 

In addition to the constraints in problem (FBA), flux balance analysis often includes 
extra inequalities on the net flux of internal reactions. It is not possible to incorporate 
explicit inequality constraints on net flux into problem (EP) because if any such inequalities 
were active at the optimum, the corresponding nonzero dual variables would appear in 
a modification of the optimality conditions (lO)-(ll) and interfere with satisfaction of 
the thermodynamic constraints (2) that we know must hold. In practical applications of 
problem (EP) to genome-scale biochemical networks, for arbitrarily chosen c, the omission 
of explicit bounds on net flux leads to a subset of net fluxes with directions opposite to 
that known biochemically. 

At this point, one might think that an obvious application would be to search for 
free parameters cj, Cr that minimize the Euclidean distance between predicted and ex- 
perimentally determined change in chemical potentials or fluxes. In E. coli metabolism 
[17], for example, one can experimentally quantify transformed reaction Gibbs energy (in 
vivo change in chemical potential) using quantitative measurement of absolute concen- 
trations [6], combined with experimentally derived [2, 3] or group contribution estimates 
[21, 20, 24, 15] of standard transformed reaction Gibbs energy. However, obtaining in vivo 
chemical potentials is not a limitation to validation of the practical utility of problem (EP). 
Even if one could find free parameters c/, corresponding to predicted chemical potentials 
close to experimental data, the corresponding fiuxes might not satisfy mass action kinetics, 
which is known to hold for elementary chemical reactions. 

A reliable algorithm for satisfaction of mass action kinetics at a non-equilibrium steady 
state is the subject of ongoing work [1]. Given a consistent stoichiometric matrix S, for 
there to exist a non-equilibrium steady state it is necessary that the boundary condition 
be in the range of the stoichiometric matrix: b € TZ{S). Given thermodynamically feasible 
kinetic parameters in addition, it is an important open problem to establish if 6 € 7^(5') 
is also sufficient for there to exist a non-equilibrium steady state satisfying mass action 
kinetics. It may be that the set of feasible boundary conditions is a (proper) subset of the 
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range of S. In fact, the boundary condition obtained from flux balance analysis b = —SeV* 
is an underdetermined parameter for problem (EP) because v* may not be unique, given 
the absence of strict convexity in problem (FBA). Problem (EP) is well defined for any 
choice of linear objective function d?"ve in problem (FBA). Different values of d simply yield 
different vectors h = —SeV*. Problem (EP) may also be used with other non-FBA methods 
that calculate optimal exchange fluxes v* provided these methods yield a b £ TZ{S). 

We consider problem (EP) to represent a mapping between a set of parameters and 
the corresponding set of thermodynamically feasible fluxes and potentials. More precisely, 
when S is row reduced (to have full row rank), problem (EP) represents a single- valued 
surjective saddle point mapping [40] between a parameter vector (c, b) and the correspond- 
ing primal-dual optimal vector {vj,v*,y*). That is, each parameter vector corresponds to 
a unique primal-dual optimal vector, and every optimal vector is associated with at least 
one parameter {c,b). Moreover, under the same conditions, this saddle point mapping is 
locally Lipschitz continuous [12]. That is, the optimal vector of fluxes and potentials is 
smoothly perturbed by a smooth perturbation of the parameter vector. 

The properties discussed in the previous paragraph motivate future efforts to design 
algorithms for gradient-based optimization of the parameter vector. We envisage a conver- 
gent sequence of parametric convex optimization problems whose final optimum satisfies 
mass-action kinetic constraints. However, the necessary and sufficient conditions for con- 
vergence of a sequence of parametric convex optimization problems is still an active research 
area within convex analysis. Exploitation of the properties of problem (EP) may provide 
the route to such an algorithm. 

The computational tractability of linear flux balance analysis (problem (FBA)) is one 
of the key reasons for its widespread use as a genome-scale modeling tool. The theorems 
of this paper provide the theoretical basis for efficiently computing thermodynamically 
feasible fluxes, even for the largest genome-scale biological networks [47]. In particular: 

• Problem (EP) is convex and includes only linear constraints. It is feasible whenever 
problem (FBA) is feasible, and its optimal solution is then unique. 

• Efficient polynomial-time algorithms [4] exist for solving convex optimization prob- 
lems of this form, based on interior methods [52]. 

• For both (FBA) and (EP), these algorithms are guaranteed to return an optimal 
solution, or a certificate that the original problem is infeasible. 

In practice, computing thermodynamically feasible fluxes using convex flux balance analysis 
described herein should take no longer than performing linear flux balance analysis. 

Although the ultimate value of this work lies in what is accomplished with it in future 
biological studies, we believe it makes an important step forward by providing the first 
computationally tractable method for implementing energy conservation and the second 
law of thermodynamics for genome-scale biochemical networks in a non-equilibrium steady 



14 



state. We also establish, in an exact manner, the duality relationship between reaction 
rates and chemical potentials. Furthermore, our theorems extend to any potential network 
where flux balance holds and the change in potential can be written as a difference in a 
monotone function of fluxes: Au = g{vr) — g{vf), where g{-) : — >■ R™' is monotone [39]. 
These potential networks admit a convex optimization model whose solution is unique and 
whose potentials are Lagrange multipliers for the flux balance constraint. 
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